ABSTRACT

This paper examines two fields that contribute to research
on digital libraries - information systems and orality-literacy studies - and
applies them to a particular digital library domain, botanical taxonomic
work. Topics discussed include: (1) an introduction to HOSS (i.e., a
computationally-oriented hypermedia system) architecture, including the 
hyperbase layer, structure processing layer, metadata manager layer, 
application layer, and other tools; (2) an introduction to orality, literacy, 
and hyperliteracy; (3) botanical taxonomic scholarship; (4) information 
systems technology applications, including single/multiple taxonomies, 
ownership of taxonomies, and definition of taxonomies; and (5) hyperliterate 
work practices, including single/multiple taxonomies, ownership of 
taxonomies, and definition of taxonomies. A table presents examples of 
differences between orality and literacy. Contains 19 references. 
Designing Digital Libraries for Post-literate Patrons

Peter J. Nürnberg, Erich R. Schneider, John J. Leggett
Center for the Study of Digital Libraries, Texas A&M University, USA
email: {pnuern, erich, leggett}@csdl.tamu.edu

Abstract: Many Web and Internet technologies have traditionally been used to serve information stores 
across machines and between people. There has been a great deal of recent interest in using these 
information services to support digital libraries. Digital library research is an interdisciplinary effort
that must synthesize existing research from highly disparate fields. This paper examines two such
contributing fields - information systems and orality-literacy studies - and applies them to a particular
digital library domain, botanical taxonomic work. In trying to build digital libraries for botanical
taxonomists, we show how two widely differing fields can each provide part of a solution that neither can
provide alone.

0. Description of the Paper 

Many Web and Internet technologies have traditionally been used to serve information stores across 
machines and between people. There has been a great deal of recent interest in using these information 
services to support digital libraries. Research in digital libraries, like any interdisciplinary endeavor, is
complicated by the need to consider and synthesize results from several fields, including ones that are
perhaps unfamiliar. This paper seeks to tie together two very different fields - information systems and
orality-literacy studies - that each have something to offer the digital library designer. The authors have 
chosen an unconventional format for presenting this material. The paper contains two threads for the two 
fields it draws upon. Some sections belong in only one thread, while others belong in both. The paper 
can be read in different ways, but most people will find it easiest to read the thread with which they are 
more familiar first, in order to contextualize the material, and then delve into the other thread. Figure 1 
below illustrates the organization of the paper. (Note: the information systems thread should be read 
from the left column and the orality-literacy thread from the right). 



Figure 1: Organization of the paper 
IS1. Introduction

For many reasons, archaic work practices of varying "inappropriateness" to modern scholarship linger on
despite their known flaws. In information-intensive fields, considerable support for the development of
new work practices can be provided by digital libraries and the technologies underlying them. In
particular, advanced distributed, computationally-oriented hypermedia systems, with their capability to
support more fluid information structures, have often been proposed for use in fields where the mutable
cognitive artifacts that scholars employ are known to be poorly reflected in the static artifacts
produced by pre-electronic work practices for pre-electronic distribution methods.

OL1. Introduction

For many reasons, archaic work practices of varying "inappropriateness" to modern scholarship linger on
despite their known flaws. In information-intensive fields, the derivation of possible new work practices
can be suggested by differentiating those aspects of current practice that are archetypic to the problem
addressed from those artifactual to the technologies currently employed. In particular, orality-literacy
studies are here proposed for this purpose in fields where the mutable cognitive artifacts that scholars
employ are known to be poorly reflected in the static artifacts produced by pre-electronic work practices
for pre-electronic distribution methods.

IS2. HOSS Architecture

HOSS is a computationally-oriented hypermedia system [Nürnberg et al. 1996]. It consists of a hyperbase
layer, a structure processing layer, a metadata manager layer, and an application layer. Each of these
will be briefly described below.

The main difference between HOSS and other hypermedia systems is that HOSS is an entire operating
environment. It provides file system, memory management, and scheduling features. Other operating system
functionality is provided by a SunOS 5.4 kernel. HOSS is best thought of as a hypermedia-aware operating
system. An immediate result of this is that HOSS, as any operating system, admits an open set of
application processes. Furthermore, just as all applications in a real-time operating system may take
advantage of real-time awareness on the part of the operating system, all HOSS applications have
immediate access to hypermedia functionality. The functionality of the hyperbase and (open) structure
processing layer is available to all HOSS processes.

OL2. Orality, Literacy, and Hyperliteracy

Since the 1960s an interdisciplinary research area within the humanities known as orality-literacy
studies has existed, concerned with differences in the modes of thought and expression exhibited by
individuals in cultural situations which exhibit primary orality (where writing is not used as an adjunct
to thought and memory) and those exhibiting pervasive literacy (where it has become indispensable for
those activities).

OL2.1 Orality and Literacy



A seminal work in orality-literacy studies is Preface to Plato by classicist Eric Havelock [1963], whose
starting point is Plato's attack on poetry in the Republic [Waterfield 1993]. Plato's proposal that
poetry be banned from his ideal state, because it degraded the intellect, is found odd by many modern
students of Plato. Havelock sets out to examine what this apparent oddity in the philosopher's thought
implies about the cultural situation of Plato's Greece.

Havelock contends the extensive ground of common knowledge and worldviews required by classical Greek
culture was encoded in the great poems of the time, most notably Homer's epics. To the ancient Greeks,
these were a "tribal encyclopedia" of cultural ways and norms. Poetry was also well-suited to the
problems of information storage in a non-literate culture, namely retention in living memory and
content-preserving transmission [Havelock 1963]. In essence, recitation of the epics was able to induce
in reciters and listeners an almost hypnotic state that assisted correct remembrance. It also encoded
cultural knowledge situationally. Both of these were anathema to Plato, who was promoting reflective
thought on the nature of abstracts. Plato's literacy allowed him to encode knowledge externally as a
thing "in itself" and allowed him to examine concepts and their abstract structures without forgetting
them. Thus, Havelock concludes, arises Plato's excoriation of poetry as education method, as inhibitor
of abstract speculation on the nature of the true, good, and beautiful. For our purposes, we note that
Havelock showed the consideration of ideas as eternal "things in themselves" is an artifact of literacy,
not an archetypic aspect of thought.

Table 1: Examples of Differences Between Orality and Literacy.

                                          | Orality                              | Literacy
Ideas as... [Havelock 1963]               | properties of concrete situations    | abstract and eternal "things in themselves"
Socially relevant truths as... [Ong 1982] | mutable objects                      | fixed objects
Language use as... [Ong 1982]             | requiring consideration of situation | manipulation of abstract placeholders

Among other artifactual properties of literacy (examined in another seminal work of the field, Walter
Ong's Orality and Literacy [Ong 1982]) is the notion of written truth as permanent truth.

Today, it is common for material to be written down and remain unchanged for extended periods of time.
If that material had some veracity when it was recorded, we tend to regard its "truth" as a permanent
property that can be redemonstrated at any time. This is not the case with orally transmitted knowledge,
which cannot be "recorded" except in living memory. As a result, material for which there is no call is
forgotten, and changes to the material that give advantage will occur. Revisionism is reality in primary
oral cultures; the beliefs that the written retains its truth for all time and that, by extension,
publication implies truth are artifacts of literacy.

OL2.2 Hyperliteracy

Many believe that we are entering an era where electronic tools for storing and manipulating information
will be considered indispensable for everyday thinking and remembering. Douglas Engelbart [1963]
expressed this belief when he described a "certain progression of our intellectual capabilities", from
concept manipulation (manipulation of concepts in the mind alone) to symbol manipulation (expression of
concepts through language) to manual external symbol manipulation (manipulating linguistic symbols using
writing) and finally to automated external symbol manipulation (manipulation of symbols using
computers). Engelbart's second stage corresponds with the concept of "primary orality", and his third
stage with "pervasive literacy". We extend the concept of orality and literacy by positing a new
property of culture, pervasive hyperliteracy or simply hyperliteracy, corresponding to Engelbart's
fourth stage.

Why posit hyperliteracy? If we are indeed entering an era where automated external symbol manipulation
tools have become prerequisites of serious thought, then the designers of such tools should be
interested in which aspects of thought are intrinsic to language-using human beings and which aspects
are products of the use of non-electronic writing, since some of the latter may decrease in strength or
disappear altogether in the residents of this new era. As can be seen from the above, these artifactual
properties are not trivial, and they are precisely the concern of orality-literacy studies.

IS2.1 Hyperbase Layer

A HOSS hyperbase is a process with two threads: a Versioned Object Manager (VOM) and an Association Set
Manager (ASM). The VOM acts as a client of some Storage Manager (SM) that exists outside of the
hyperbase. The VOM serves simple object and composite object abstractions and provides full versioning
support for both [Hicks 1993]. The ASM is implemented as a client of the VOM, mapping the VOM
abstractions to structural entity abstractions called associations and association sets [Leggett and
Schnase 1994; Schnase 1992]. Because the ASM is a client of the VOM, it inherits versioning support for
its abstractions as well.

A HOSS hyperbase is conceptually similar to other hyperbase systems [Leggett and Schnase 1994; Schnase
1992; Shackelford et al. 1993; Schütt and Streitz 1990; Wiil 1993].

IS2.2 Structure Processing Layer

HOSS allows an open set of structure processors called Sprocs. All Sprocs are clients of the ASM. The
difference between Sprocs lies in the kinds of structure they manipulate. A key aspect to HOSS Sprocs is
that they abstract behavior from structure [Nürnberg et al. 1996].

One example of a HOSS Sproc is the Link Services Manager (LSM). The LSM manages traditional hypermedia
structure - namely, inter-application linking structure. It provides functions to create, navigate,
manipulate, and destroy structure between application data. In the case of the LSM, behaviors correspond
to the semantics of particular navigational structure traversals.

The Taxon Manager (TaxMan) provides a second example of a HOSS Sproc. TaxMan acts as a client of both
the VOM and the ASM, and serves taxonomic structural abstractions. These taxonomic abstractions are
widely applicable. For example, botanical taxonomists use abstractions such as family, genus, species,
etc. to classify plant specimens. Also, linguists develop linguistic taxonomies to represent the
developmental histories of different languages.

Additionally, the TaxMan provides a number of standard computations over taxonomic structures (i.e.
behaviors). Some examples of these behaviors include structure querying (e.g. find all family taxa that
contain four genera with only one species each) and structure collapsing (e.g. collapsing species,
subspecies, section, etc. into the genera taxa and transferring the associations between specimen data
and these collapsed taxonomic levels to the genera).

IS2.3 Metadata Manager Layer

Metadata managers are system processes that primarily serve abstractions to other system processes. They
build the abstractions they serve from abstractions served by other metadata managers, Sprocs, and
hyperbases. Metadata managers can be viewed as abstract data types, exporting data and functional
abstractions.

IS2.4 Application Layer

Application processes are user processes familiar from conventional operating systems. The nature of
these processes is open. One example of an application that has been built is a WWW Common Gateway
Interface (CGI) [Berners-Lee et al. 1992] program that acts as a client to the TaxMan, allowing queries
to be made over a taxonomic space, displaying the results, and allowing users to annotate the records
displayed in answer to the query. Another example is a Motif/X [Nye 1988; Young 1990] client that allows
graphic editing and manipulating of taxa.

IS2.5 Other Tools

A number of tools have been built for application, metadata manager, and Sproc construction [Nürnberg
1994]. The HCMT and HPMT toolkits provide certain process model and inter-process communication
primitives. A tool called the PDC allows quick construction of servers by generating the necessary
protocol libraries from high-level protocol specifications.
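
The two TaxMan behaviors mentioned earlier, structure querying and structure collapsing, can be illustrated with a small sketch. This is not TaxMan's interface, which the paper does not specify: the `Taxon` class and both function names are hypothetical, the taxonomy is held in memory rather than in a hyperbase, and the example query is read as "exactly four genera, each with exactly one species".

```python
from dataclasses import dataclass, field


@dataclass
class Taxon:
    """Hypothetical taxonomy node: holds child taxa and directly attached specimens."""
    name: str
    rank: str                                   # e.g. "family", "genus", "species"
    children: list["Taxon"] = field(default_factory=list)
    specimens: list[str] = field(default_factory=list)


def walk(taxon):
    """Yield the taxon and all of its descendants, depth first."""
    yield taxon
    for child in taxon.children:
        yield from walk(child)


def families_with_monotypic_genera(root, genus_count=4):
    """Structure query: names of families that contain exactly `genus_count`
    genera, each of which contains exactly one species."""
    hits = []
    for taxon in walk(root):
        if taxon.rank != "family":
            continue
        genera = [c for c in taxon.children if c.rank == "genus"]
        if len(genera) == genus_count and all(
            sum(1 for s in g.children if s.rank == "species") == 1 for g in genera
        ):
            hits.append(taxon.name)
    return hits


def collapse_to_genera(root):
    """Structure collapsing: drop every level below genus and move the specimens
    attached to those levels up to the enclosing genus (modifies the tree in place)."""
    for taxon in walk(root):
        if taxon.rank == "genus":
            for child in taxon.children:
                for below in walk(child):
                    taxon.specimens.extend(below.specimens)
            taxon.children = []
    return root
```

Keeping the query and the collapse as functions separate from the `Taxon` data mirrors, in a very small way, the idea that Sprocs abstract behavior from structure.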



3. Botanical Taxonomic Scholarship 

A curious aspect of some scholarly work practices is that often, these practices are known to depend on 
false assumptions or over-simplifications of a problem. In some cases, such as in certain economic 
models, these false assumptions are taken as reasonable because they produce good results and make the 
models tractable. 

In other cases, however, these false assumptions are simply products of tradition, based in part on 
artifacts of old technology and literate mindsets. We take as one very specific example our experiences 
with botanical taxonomists. For several years, we have worked together with botanists to build a digital 
library of herbarium collection data. We have been able to observe several common current work 
practices that have changed as our botanist colleagues both gain access to new technology and 
re-evaluate those parts of their old technology that dictated how they did their jobs. As a particularly 
good example of a current work practice dictated by current technology, consider that there are journals 
that use taxonomies that everyone (including the journal editors!) acknowledges are outdated. The 
editors of the journal, however, are reluctant to correct the errors in this standard taxonomy, partly 
because the fixes are not universally agreed upon, but also because changing the taxonomy now would 
"invalidate" articles just published. The current common practice, then, is for researchers to carry out 
their work using a more realistic taxonomy, and then literally "uncorrect" their terms to match the 
journal standard. 

For reference, the object of taxonomic classification is the taxonomy, which consists of taxa, which
themselves consist of other taxa or specimens. Taxa are composed in a hierarchic fashion. Taxa at 
different levels in the tree have different names, such as family, genus, species, etc. We briefly describe 
three interesting problems we observed the taxonomists encounter in their current work practices. 

Different groups of taxonomists produce different taxonomies, even if the specimen set examined is 
identical. Groups in which particular specialists work on a given taxon may show more detail in the 
expansion of that taxon, or different groups may use different measures of similarity when composing 
taxa, weighting various kinds of evidence differently. It seems contradictory to have multiple solutions 
to a classification problem. 

Separate taxonomic groups produce separate taxonomies, which are then identified with the groups that
produced them. This is despite the fact that a taxonomy may always be used in conjunction with other
taxonomies, or that it is based on the prevailing attitudes in the community. It seems contradictory that a communally
defined, communally used product is identified with a small set of taxonomists. 

The products of the work are often taxonomies, not simply revisions to existing taxonomies. Whether 
updates or new full revisions, the products are viewed as closed, well-defined entities, representing an
opinion of a group at some time. However, new evidence, new analysis methods, and new 
interpretations are constantly being introduced. It seems contradictory to produce a well-defined, static 
analysis of an ill-defined, dynamic phenomenon. 
OL4. Hyperliterate Work Practices

Addressing the three examples of seeming contradictions in current work practices requires different
artifacts than those present in the physical library with its literate artifacts. What is required here
are new digital library elements and tools, not derived from physical antecedents [Nürnberg et al.
1995]. Of course, it is impossible to say what all of these artifacts will be. This section outlines
some possible artifacts to begin to address these contradictions.

OL4.1 Single/Multiple Taxonomies

One artifact of literacy is the notion of single-valued, static truths [Ong 1982]. The work practice of
developing and publishing taxonomies separately from one another is a particular instantiation of this
artifact. The product of this work is a taxonomy, a "taxonomic fact" or truth, presented and interpreted
as such. However, the notion of truth is changing from the literate view of static and single-valued to
the hyperliterate view of dynamic and multi-valued. Consider the Guides project approach to teaching
history in which various personae contextualize history from a particular point of view [Solomon et al.
1989]. The "truth" of the matter is a space, in which various points of view are represented. This
contrasts sharply with the notion of the authority of the book as conveyor of a single, coherent message
as in the literate world [Chartier 1994]. Perhaps instead of viewing the primary goal of a taxonomist as
the generation of a new taxonomy, which then must be related to previous and competing taxonomies by the
consumer, the product may be viewed as a change to the existing body of knowledge. In fact, in essence,
taxonomists do view the purpose of their work in this way, but the actual product of their work, the
printed taxonomy, is only a means to this end. Reconciliation and contextualization are the
responsibility of the consumer.

IS4. Technology Applications

Addressing the three examples of seeming contradictions in current work practices requires different
supporting technologies than those present in the physical library. What is required here are new
digital library elements and tools, not derived from physical antecedents. Of course, it is impossible
to say what all of these technologies will be. This section outlines some possible technologies to begin
to address these issues.

IS4.1 Single/Multiple Taxonomies

Two important capabilities that help address single/multiple taxonomies problems are structure
management and versioning. Hypertext structure management abstracts the structure over objects from the
objects themselves. Oftentimes, this takes the form of abstracting traversal or navigational structure
from data to be navigated. However, the principle of structure abstraction can be applied to any realm
in which multiple structures may be applied to a given data set. This is precisely the case in taxonomic
work. Different taxonomies (structures) are built over the same specimen (data) set. Because the TaxMan
inherits the structure management abstractions of HOSS, including contexts (sets of structure elements
and their associated behavior processes), it can use these contexts to partition the taxonomic data into
consistent taxa sets.

Because the TaxMan is implemented on top of HOSS, it inherits the versioning support for both data and
structural objects therein. This provides a natural way to model differences over time in a given
taxonomy, as well as differences with respect to authority in the same time frame. Additionally, changes
in the analysis of specimens (perhaps the addition of new pictures or new genetic information) can be
added to the data set by versioning the appropriate specimen data object, thereby not invalidating
taxonomies based on the older version of the object.
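
As a rough illustration of how contexts and versioning work together, the sketch below keeps specimen data in an append-only versioned store and lets each taxonomy (context) refer to specific specimen versions. Every name here (`VersionedStore`, `Context`, `classify`, `taxa_of`) is an assumption made for the example and is not taken from HOSS or TaxMan.

```python
class VersionedStore:
    """Append-only object store: each update adds a version, old versions stay readable."""

    def __init__(self):
        self._versions = {}            # object id -> list of values; list index = version

    def put(self, obj_id, value):
        history = self._versions.setdefault(obj_id, [])
        history.append(value)
        return len(history) - 1        # the version number just written

    def get(self, obj_id, version=None):
        history = self._versions[obj_id]
        return history[-1] if version is None else history[version]


class Context:
    """One taxonomy (structure) laid over the shared specimen data.
    Maps a taxon name to the (specimen id, version) pairs it was built from."""

    def __init__(self, authority):
        self.authority = authority
        self.members = {}              # taxon name -> set of (specimen id, version)

    def classify(self, taxon, obj_id, version):
        self.members.setdefault(taxon, set()).add((obj_id, version))

    def taxa_of(self, obj_id):
        """Names of the taxa in this context that reference the given specimen."""
        return sorted(
            taxon for taxon, refs in self.members.items()
            if any(ref_id == obj_id for ref_id, _ in refs)
        )
```

Because a context records the version it was built from, adding a new version of a specimen (for example with genetic data) leaves every existing taxonomy pointing at the data it actually examined, while two groups can still classify the same specimen differently in separate contexts.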

IS4.2 Ownership of Taxonomies

One important capability that helps address 
ownership of taxonomies problems is 
annotation support. An important aspect of 
maintaining and using community objects is 
annotating and sharing annotations over 
community objects. Such annotations can be 
used to judge the communal level of 
acceptance of a part of the community body 
of knowledge or other particularly 
noteworthy aspects. Moderately sophisticated 
access control, search facilities and filtering 
mechanisms over the annotation space should 
be provided. We have developed a HOSS 
Sproc named AnnoMan, which models sets 
of annotations as structure contexts and 
provides these features. Modeling annotations 
as structure in a hyperbase is straightforward 
- different structural elements (annotations) 
are laid over existing taxa and specimens 
(data), grouped into contexts, and managed 
by existing hyperbase software that can 
provide access control. 
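
A minimal sketch of annotations modeled as structure over shared objects follows. It is not AnnoMan's actual design, which the paper does not detail: the `Annotation` fields, the reader-list form of access control, and the `acceptance` measure are all assumptions chosen to make the idea concrete.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Annotation:
    target: str        # id of the taxon or specimen being annotated
    author: str
    text: str
    agrees: bool       # whether the author accepts the annotated item


class AnnotationContext:
    """A named set of annotations with a simple reader list for access control."""

    def __init__(self, name, readers=None):
        self.name = name
        self.readers = set(readers) if readers is not None else None   # None = public
        self.annotations = []

    def add(self, annotation):
        self.annotations.append(annotation)

    def visible_to(self, user):
        return self.readers is None or user in self.readers


def search(contexts, user, target=None, keyword=None):
    """Annotations the user may read, optionally filtered by target and keyword."""
    found = []
    for ctx in contexts:
        if not ctx.visible_to(user):
            continue
        for note in ctx.annotations:
            if target is not None and note.target != target:
                continue
            if keyword is not None and keyword.lower() not in note.text.lower():
                continue
            found.append(note)
    return found


def acceptance(contexts, user, target):
    """Fraction of readable annotations on `target` that agree, or None if there are none."""
    notes = search(contexts, user, target=target)
    if not notes:
        return None
    return sum(note.agrees for note in notes) / len(notes)
```

The annotated taxa and specimens are never modified; annotations only refer to them by id, which is what allows several annotation contexts with different readers to coexist over the same community objects.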

IS4.3 Definition of Taxonomies

Another important capability that helps 
address definition of taxonomies problems is 
computation over hypermedia structure. The 
nature of the information in taxonomic 
research may be open in the sense that the 
boundaries around it may be hard to define, 
especially outside of a particular context. 
However, dealing with documents that 
exhibit no sense of closure at all can be 
disorienting as well. What is needed is a way 
in which the open space can be viewed as 
only "partially" open - that is, enforcing some 
sort of boundaries appropriate in a context, 
but allowing these boundaries to be crossed 
or recomputed. One way in which to do this 
is to take advantage of computation over 
structure which dynamically generates closed 
sets of structure appropriate for a particular 
use. 
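
One simple way to picture computation that produces a closed set from an open structure is a bounded traversal. The sketch below is an illustrative assumption, not a HOSS computation: it treats the structure as a dictionary of links, and the function name `closed_view` and its parameters are hypothetical.

```python
from collections import deque


def closed_view(links, start, max_depth, allowed=lambda node: True):
    """Breadth-first traversal of an open link structure that stops at a depth
    bound and at nodes rejected by `allowed`, returning a finite set of nodes."""
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        node, depth = queue.popleft()
        if depth == max_depth:
            continue                   # boundary reached: do not follow links further
        for nxt in links.get(node, ()):
            if nxt not in seen and allowed(nxt):
                seen.add(nxt)
                queue.append((nxt, depth + 1))
    return seen
```

Calling the function again with a larger `max_depth` or a different `allowed` predicate recomputes the boundary, which corresponds to the "partially open" view described above: the set is closed for one use but can be redrawn for another.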



OL4.2 Ownership of Taxonomies

Literacy promotes the concept of idea ownership by the individual, even when the idea represents a
communally held truth. In this case, taxonomies are identified with their producers or publishers. There
is no way to recognize the contextualization of a taxonomy in itself. However, the notion of authorship
is changing from owner of a document and by extension its ideas to recorder of ideas that are the
product of several people, past and present. Consider an analogy from the business world - the growing
role of the analyst [Reich 1991]. The analyst provides a filtering or ordering function for data that is
oftentimes already available. Many new companies focus no longer on the production of information, but
on its compilation. This reflects a situation in which the problem of information is what to do with the
overabundance of it (the "information explosion"), and not how to find and retrieve data [Chartier
1994].

OL4.3 Definition of Taxonomies

One artifact of literacy is closure of ideas. The product of taxonomic work is a well-defined, discrete
entity. Products need no longer be closed. They may exist as changing entities over time, with poorly
defined borders. Consider Web sites with links to many other sites. These sites have no closure per se.
Where one chooses to draw boundaries is contextually and individually defined. This is in opposition to
the closure engendered by books and other written entities [Chartier 1994]. As above, one new
possibility is a communally maintained set of taxa, with various notes, modifications, and addenda
separately maintained over these taxa. The boundaries of the communal knowledge could only be determined
by a given consumer at a given moment.

5. Conclusions

The information systems thread of this paper asserted the existence of new work practices in botanical
taxonomic scholarship enabled by new technologies. The new work practices, however, were assumed to
arise spontaneously due to problems found in current work practices.

The orality-literacy thread of this paper motivated why certain new work practices might arise in 
botanical taxonomic scholarship, but did not offer any particular ways to cope with them. 

The digital library will have to support the new work practices of people. The changes in such practices 
must be identified. We extended orality-literacy to hyperliteracy in an attempt to characterize the 
changes. The new practices will have to be supported by new technologies. We showed systems and 
tools able to support the needs of one particular research community. The threads in this paper, 
therefore, must rely upon one another, one for motivation, the other for prototypic solutions. We see this 
as a microcosm of the digital libraries research field - a field in which results from many different and 
dissimilar areas will need to be synthesized to produce the research necessary to redesign the tools with 
which people think. 